feat(trigger): data class and helper functions for lambda trigger events #159
Put together an initial set of common Lambda trigger events
Codecov Report
@@ Coverage Diff @@
## develop #159 +/- ##
===========================================
Coverage 100.00% 100.00%
===========================================
Files 33 46 +13
Lines 918 2070 +1152
Branches 77 90 +13
===========================================
+ Hits 918 2070 +1152
Continue to review full report at Codecov.
Tracked down some of the AWS docs to inline in the dict wrapper classes.
For cases where the dict key name conflicts with a Python builtin, we should use a consistent method name.
Add support for UserMigrationTriggerEvent and include better docstrings
Add support for CustomMessageTriggerEvent and PreAuthenticationTriggerEvent
Add support for PreTokenGenerationTriggerEvent and PostAuthenticationTriggerEvent
Changes:
* Add some missing docs pulled from various AWS docs
* Fix SQS mapping
* Make docs consistent
API Gateway events?
Sure. I can add that.
@jplock - I added support for API Gateway proxy events.
aws_lambda_powertools/utilities/trigger/cloud_watch_logs_event.py
For both SNS and S3 event notifications we actually include a single record, so we can have convenience methods to access popular resources.
class CloudWatchLogsLogEvent:
    def __init__(self, log_event: dict):
        self._val = log_event
Before you assign the event, it's a good idea to check first if the event has the key/values that you expect and if the event version is the version that you can handle.
Current event formats: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-and-eventbridge.html
Sample event format:
{
    "version": "0",
    "id": "c4c1c1c9-6542-e61b-6ef0-8c4d36933a92",
    "detail-type": "CloudWatch Alarm State Change",
    "source": "aws.cloudwatch",
    "account": "123456789012",
    "time": "2019-10-02T17:04:40Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:cloudwatch:us-east-1:123456789012:alarm:ServerCpuTooHigh"
    ],
    "detail": {
        "alarmName": "ServerCpuTooHigh",
        "configuration": {
            "description": "Goes into alarm when server CPU utilization is too high!",
            "metrics": [
                {
                    "id": "30b6c6b2-a864-43a2-4877-c09a1afc3b87",
                    "metricStat": {
                        "metric": {
                            "dimensions": {
                                "InstanceId": "i-12345678901234567"
                            },
                            "name": "CPUUtilization",
                            "namespace": "AWS/EC2"
                        },
                        "period": 300,
                        "stat": "Average"
                    },
                    "returnData": true
                }
            ]
        },
        "previousState": {
            "reason": "Threshold Crossed: 1 out of the last 1 datapoints [0.0666851903306472 (01/10/19 13:46:00)] was not greater than the threshold (50.0) (minimum 1 datapoint for ALARM -> OK transition).",
            "reasonData": "{\"version\":\"1.0\",\"queryDate\":\"2019-10-01T13:56:40.985+0000\",\"startDate\":\"2019-10-01T13:46:00.000+0000\",\"statistic\":\"Average\",\"period\":300,\"recentDatapoints\":[0.0666851903306472],\"threshold\":50.0}",
            "timestamp": "2019-10-01T13:56:40.987+0000",
            "value": "OK"
        },
        "state": {
            "reason": "Threshold Crossed: 1 out of the last 1 datapoints [99.50160229693434 (02/10/19 16:59:00)] was greater than the threshold (50.0) (minimum 1 datapoint for OK -> ALARM transition).",
            "reasonData": "{\"version\":\"1.0\",\"queryDate\":\"2019-10-02T17:04:40.985+0000\",\"startDate\":\"2019-10-02T16:59:00.000+0000\",\"statistic\":\"Average\",\"period\":300,\"recentDatapoints\":[99.50160229693434],\"threshold\":50.0}",
            "timestamp": "2019-10-02T17:04:40.989+0000",
            "value": "ALARM"
        }
    }
}
Keys that can be checked before we initialize the object:
- version
- source
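For illustration, such a guard could run before wrapping the dict (a hypothetical check, not part of this PR):

    def _validate_cloud_watch_event(event: dict):
        # Reject events whose version/source we do not know how to handle
        if event.get("version") != "0" or event.get("source") != "aws.cloudwatch":
            raise ValueError("Unsupported event version or source")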
@koiker - I did not know CloudWatch Logs events had any such variations
@koiker Those docs are for CloudWatch Events sent to EventBridge. This class is handling CloudWatch Logs events and the encoded data.
aws_lambda_powertools/utilities/trigger/api_gateway_proxy_event.py
Use new DictWrapper abstract class for the data classes that wrap an event Dict
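A rough sketch of what such a DictWrapper base class might look like (illustrative; the real implementation may carry extra helpers):

    class DictWrapper:
        """Provides read-only access to a wrapped event dict"""

        def __init__(self, data: dict):
            self._data = data

        def __getitem__(self, key: str):
            return self._data[key]

        def __eq__(self, other) -> bool:
            if not isinstance(other, DictWrapper):
                return False
            return self._data == other._data

Comparing only `_data` in `__eq__` also keeps any later cached attributes out of equality checks.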
@heitorlessa - would it help to include helper methods for common tasks? e.g. for SES events you might want to fetch the content from S3 by messageId and parse out the email:
import email
import quopri
from typing import Optional

import boto3
from botocore.exceptions import ClientError

s3 = boto3.resource("s3")
# Configured S3 bucket name
s3_bucket = "..."
# Configured Object key prefix
key_prefix = "..."

def get_message_contents(message_id: str, content_type: str = "text/html") -> Optional[str]:
    """Load the email from S3 and search the message contents by content type

    :param message_id: Used as the key to look up the message on S3
    :param content_type: Content type of the payload to return, defaults to `text/html`
    :return: The email contents matching the specified content_type
    """
    try:
        obj = s3.Object(s3_bucket, key_prefix + message_id)
        message_contents = obj.get()["Body"].read().decode("utf-8")
    except ClientError as e:
        raise Exception("S3 ClientError:", e.response["Error"]["Message"])
    else:
        email_message = email.message_from_string(message_contents)
        if email_message.is_multipart():
            # Return the first payload matching content_type
            return next(
                (
                    quopri.decodestring(part.get_payload()).decode()
                    for part in email_message.walk()
                    if content_type in part.get_content_type()
                ),
                None,
            )
        # Fall back to the whole payload
        return email_message.get_payload()
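Hypothetical usage from an SES-triggered handler, assuming an `SESEvent` data class exposing the record's `message_id`:

    ses_event = SESEvent(event)
    html_body = get_message_contents(ses_event.record.ses.mail.message_id)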
Add support for Kinesis stream events with a helper method to decode data
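A minimal sketch of what that decode helper could look like on the Kinesis record data class (method names are illustrative):

    import base64

    def data_as_bytes(self) -> bytes:
        """Decode the base64-encoded record data"""
        return base64.b64decode(self.data)

    def data_as_text(self) -> str:
        """Decode the record data as a UTF-8 string"""
        return self.data_as_bytes().decode("utf-8")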
@cakepietoast @nmoutschen could you do a full review here sometime this week? I'll have a quick look tomorrow but would appreciate some fresh eyes here while I can't dedicate some proper time to this yet. PS: I'm raising a feature request for Lambda to publish a schema for all event sources so we could auto-generate most of it in the future (including for JS).
Firstly, a BIG thank you @michaelbrewer -- it's a lot of work and thought that you've put in here; we deeply appreciate it.
A small comment for now on data as JSON/decoded for CW Logs.
I need some time to think on the design - I don't want to rush, so give me a few days to pull this and play with it some more.
At first glance, I'd like to see us having properties for the absolute minimum necessary, where the additional docstring and getters pay off in terms of maintenance, user experience, and adding a new event source.
Off the top of my head, I see big value in having getters for (a rough sketch follows this list):
- Nested keys - it's painful to remember more than one key only to fetch X data
- Data transformations like you did on Kinesis b64, CloudWatch Logs b64+gzip+optionally JSON, DynamoDB, etc.
- Ambiguous properties that would benefit from additional explanations since their names or values can be easily misinterpreted
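For example, a nested-key getter on a wrapped event might look like this (a rough sketch; `alarm_name` is illustrative, not part of the PR):

    @property
    def alarm_name(self) -> str:
        # Saves users from remembering event["detail"]["alarmName"]
        return self["detail"]["alarmName"]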
def decode_cloud_watch_logs_data(self) -> CloudWatchLogsDecodedData:
    """Decode, unzip and parse JSON data"""
    payload = base64.b64decode(self.aws_logs_data)
    decoded: dict = json.loads(zlib.decompress(payload, zlib.MAX_WBITS | 32).decode("UTF-8"))
Can we either make JSON deserialisation a separate property, or make it optional?
Since this is a data class utility, it's possible they'll have non-JSON data too.
I'm fine having it on by default, though -- this will likely be true for Kinesis and API GW binary.
@heitorlessa OK, I split it up.
One extra bit... if, however, the point is to expose all keys regardless for auto-completion, then it's simply a trade-off we'll have to agree on as we add more event sources. This would make sense if the end goal is to make this a separate package that other projects could benefit from.
Not yet - I'd like to see more demand on these beforehand. These (SES example) would also require IAM permissions, so there's more at play
@heitorlessa maybe once we have the schema it would replace the @property access methods, with some very basic logic to generate Python-safe names from the schema. But I would like to keep that code in place with some warning that it might change later. For getters and decoding it can just be for a small subset of frequently used elements, like query string, headers, path and body decoding for proxy API calls.
Maybe for this we just have a helper method where you can provide the contents of the S3 object.
Create BaseProxyEvent for some reusable code for the HTTP proxy events
Add support for ALB events
For handling CloudWatchLogsEvent log data, split the decode and decompress step from the parse into CloudWatchLogsDecodedData
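A rough sketch of that split, following the names in the commit message (exact signatures are illustrative):

    @property
    def decompress_logs_data(self) -> bytes:
        """Decode and decompress the base64-encoded, gzipped log data"""
        payload = base64.b64decode(self.aws_logs_data)
        return zlib.decompress(payload, zlib.MAX_WBITS | 32)

    def parse_logs_data(self) -> CloudWatchLogsDecodedData:
        """Parse the decompressed log data as JSON"""
        return CloudWatchLogsDecodedData(json.loads(self.decompress_logs_data.decode("UTF-8")))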
I also included ALB events and there is a BaseProxyEvent class that has the shared code between all three.
This pull request introduces 1 alert when merging a746ddc into 063d8cd - view on LGTM.com
Sure, that would be fine.
This pull request introduces 1 alert when merging 7b4ec44 into 063d8cd - view on LGTM.com
We don't need to include `_decompressed_logs_data` and `_json_logs_data` in equality tests
Let's change the "trigger" naming convention to something more descriptive. My suggestion is `data_classes`,
so instead of `from aws_lambda_powertools.utilities.trigger import DynamoDBStreamEvent`
we can do `from aws_lambda_powertools.utilities.data_classes import DynamoDBStreamEvent`
@cakepietoast - I just chose this name because it was based on what I saw in other libraries. More accurately, these are the data classes for Lambda trigger events. I will update this and make the other requested changes.
I'm happy with this, just need to update the documentation before we release it. I'd be glad to contribute the docs piece, let me know @michaelbrewer.
Let's create another PR for the docs - Implementation is solid, @cakepietoast is also happy with the direction. Feel free to merge @cakepietoast when you can. As always, huge thanks @michaelbrewer for working on this.
Issue #, if available: #164
Description of changes:
For each Lambda trigger event there is a corresponding typed data class, with docstrings extracted from the AWS documentation and helper methods for updating values as well as decoding any embedded data. The test suite includes a series of example trigger events.
Example screenshot of the docs:
CloudWatchLogsEvent example usage:
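A sketch of what that usage might look like (import path follows the `data_classes` rename agreed above; exact property names may differ):

    from aws_lambda_powertools.utilities.data_classes import CloudWatchLogsEvent

    def lambda_handler(event: dict, context):
        event = CloudWatchLogsEvent(event)
        # Base64-decode, gunzip and parse the embedded JSON payload
        log_data = event.parse_logs_data()
        for log_event in log_data.log_events:
            print(log_event.message)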
PreTokenGenerationTriggerEvent example usage:
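A sketch of the Cognito trigger usage, assuming the class lands under a `cognito_user_pool_event` module and a `raw_event` accessor for handing the dict back to Cognito (the suppressed claim is illustrative):

    from aws_lambda_powertools.utilities.data_classes.cognito_user_pool_event import (
        PreTokenGenerationTriggerEvent,
    )

    def lambda_handler(event: dict, context):
        event = PreTokenGenerationTriggerEvent(event)
        # Suppress a claim from the issued token
        event.response.claims_override_details.claims_to_suppress = ["email"]
        return event.raw_event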
S3Event example usage:
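A sketch of S3Event usage; note `get_object` rather than `object`, in line with the builtin-conflict naming convention mentioned earlier (exact property names may differ):

    from aws_lambda_powertools.utilities.data_classes import S3Event

    def lambda_handler(event: dict, context):
        event = S3Event(event)
        # Convenience accessor for the first record's bucket
        bucket_name = event.bucket_name
        for record in event.records:
            print(bucket_name, record.s3.get_object.key)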
APIGatewayProxyEvent example (V1 and V2 are supported):
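A sketch of the V1 proxy event usage (V2 would be a separate `APIGatewayProxyEventV2` class; property names are illustrative):

    from aws_lambda_powertools.utilities.data_classes import APIGatewayProxyEvent

    def lambda_handler(event: dict, context):
        event = APIGatewayProxyEvent(event)
        if event.http_method == "GET" and event.path == "/ping":
            return {"statusCode": 200, "body": "pong"}
        return {"statusCode": 404, "body": "not found"}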
KinesisStreamEvent example with decoding data:
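A sketch of the Kinesis usage with the decoding helper (the `data_as_text` name follows the helper sketched earlier and is illustrative):

    from aws_lambda_powertools.utilities.data_classes import KinesisStreamEvent

    def lambda_handler(event: dict, context):
        event = KinesisStreamEvent(event)
        for record in event.records:
            # Base64-decode the record payload
            data = record.kinesis.data_as_text()
            print(data)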
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.